Thermodynamic Sampling of Molecular Conformations 



Andreas Kramer* 



January 8, 2004 



Abstract 

Torsional-space Monte Carlo simulations of flexible molecules are usually based on 
the assumption that all values of dihedral angles have equal probability in the absence 
of atomic interactions. In the present paper it is shown that this assumption is not 
valid. Thermodynamic sampling using dihedral angles or other internal coordinates has 
to account for both the correct metric in conformational space and the conformation- 
dependence of the moment of inertia tensor. Metric and moment of inertia terms 
appear as conformation-dependent factors in the partition function and are obtained by 
proper separation of internal and rotational degrees of freedom. The importance of both 
factors is discussed for a number of short peptides as well as for the folded and unfolded 
states of a protein. It is concluded that thermodynamic Monte Carlo simulations of 
protein folding that neglect these correction factors tend to underestimate the stability 
of the folded state.

1 Introduction



Many organic molecules are able to adopt a large number of different conformations at room
temperature, which has far-reaching consequences for their thermodynamic and biochemical
properties. In order to separate rotational and translational degrees of freedom, molecular
conformations are generally expressed in terms of internal coordinates. It is often
sufficient to describe the conformational state of a molecule by a set of dihedral angles,
since the effect of high-frequency bond angle and bond length fluctuations can approximately
be included in the potential energy [1].

Dihedral angle coordinates are frequently used to sample molecular conformations in
Monte Carlo simulations of biomolecules (see for example [2, 3, 4, 5]). These thermodynamic
simulations are usually based on the assumption that the volume element appearing in the
partition function is given by $\prod_{i=1}^{M} d\phi_i$, where $\phi_i$ are dihedral angles
and $M$ is the number of rotatable bonds. This means that in the absence of atomic
interactions all values of dihedral angles are sampled with equal probability. In the present
paper it is shown that this underlying assumption is generally not valid, and that the proper
metric in internal coordinate space as well as effects arising from the conformation-dependence
of the moment of inertia tensor have to be taken into account in order to accurately calculate
thermodynamic quantities.

The main mathematical problem in formulating statistical mechanics in the space of molecular
conformations (which we will briefly call shape space or shape manifold in the following) is
the proper separation of motions in internal coordinates (or shape coordinates) from
rotations. This is non-trivial since it is not possible to define a body frame of reference
in a unique way as can be done for a rigid body. In fact, one can choose an arbitrary
coordinate system for each different conformation. It is known that this freedom in the
choice of the coordinate system can be expressed in terms of a gauge potential [6, 7], where
local gauge transforms correspond to independent rotations of coordinate systems; hence the
gauge symmetry group is SO(3). An interesting property of systems with internal degrees of
freedom is the possibility to generate a change in orientation solely by the variation of
shape without the application of an external torque, i.e., by moving through a closed path
in shape space. The most prominent example of this effect is the ability of a falling cat to
land on its feet starting from an upside-down position while the total angular momentum
is zero. Another example is the observation of a slow rotation of the whole system over
time in zero-angular-momentum molecular dynamics simulations of proteins [8]. This net
rotation of the system is an example of a so-called geometric phase [6], which is independent
of the parametrization of the closed path in shape space and also gauge-invariant. In the
case of three-dimensional molecules the practical calculation of these geometric phases is
complicated by the fact that the group of rotations, SO(3), is non-Abelian; however, this
will not be important for the following discussion.

A deeper mathematical foundation of the subject as well as generalizations can be
formulated using fiber bundles [9, 10]. From a classical mechanical standpoint all these
rotation-related effects can of course also be discussed in terms of Coriolis forces. A
comprehensive review on the separation of internal motions and rotations in the $N$-body
problem (with emphasis on gauge fields) has recently been given by Littlejohn and Reinsch [11].

In the following section we discuss the classical canonical partition function in shape space
(its detailed derivation is given in the Appendix) and show that it introduces two
shape-dependent correction factors which are not present in the usual Cartesian-space partition
function involving atom coordinates. The first factor transforms the volume element in
shape coordinates to the true volume element on the non-Euclidean shape manifold and
involves the metric tensor. The second factor reflects the conformation-dependence of the
moment of inertia tensor. Using dihedral angles as shape coordinates, the importance of
these correction terms is discussed in Section 3 for a number of small peptides,
Ace-(Ala)$_n$-Nme with $n = 1, 2, 3$, and the pentapeptide Met-enkephalin. In Section 4 we
estimate and compare the correction terms that arise in the folded and unfolded states of a
protein.



2 Statistical mechanics on the shape manifold 

We consider a molecule consisting of $N$ atoms with masses $m_\alpha$ for which a conformation
is uniquely described by $M$ shape coordinates $q^i$, $i = 1, \ldots, M$. The Cartesian atom
coordinates are given as functions of the shape coordinates,

$$\vec{c}_\alpha = \vec{c}_\alpha(q^1, \ldots, q^M) \tag{1}$$

with $\alpha = 1, \ldots, N$. The vectors $\vec{c}_\alpha$ are taken to be center of mass
coordinates, i.e. we assume that for all tuples $(q^1, \ldots, q^M)$ it is
$\sum_\alpha m_\alpha \vec{c}_\alpha(q^1, \ldots, q^M) = 0$. The choice of the functions
$\vec{c}_\alpha$ is of course not unique because the atom coordinates for a given conformation
are only determined up to an overall rotation of the molecule. In fact we could replace the
functions $\vec{c}_\alpha$ by their arbitrarily rotated versions,

$$\vec{c}_\alpha \to \mathsf{R}(q^1, \ldots, q^M) \cdot \vec{c}_\alpha, \tag{2}$$

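where the rotation matrix $\mathsf{R}$ is an arbitrary function of the shape coordinates. The
only assumption we make is that $\mathsf{R}(q^1, \ldots, q^M)$ and
$\vec{c}_\alpha(q^1, \ldots, q^M)$ are sufficiently well-behaved w.r.t. the existence of
derivatives. Because of the shape-dependence of the atom coordinates, the moment of inertia
tensor $\mathsf{M}$ (Appendix Eq. (19)) is also a function of the shape coordinates. It is
convenient to write $\mathsf{M}$ as a dimensionless quantity,

$$\mathsf{M} = \mathsf{M}(q^1, \ldots, q^M) = \sum_\alpha \Lambda_\alpha^{-2} \left( |\vec{c}_\alpha|^2\, \mathsf{I} - \vec{c}_\alpha \otimes \vec{c}_\alpha \right), \tag{3}$$

where

$$\Lambda_\alpha = \left( \frac{2\pi\hbar^2\beta}{m_\alpha} \right)^{1/2}$$

is the thermal de Broglie wavelength of atom $\alpha$ and $\beta = 1/k_B T$ is the inverse
temperature. Here, $\otimes$ denotes the outer product of three-dimensional vectors. We also
define the gauge potential (see Eq. (23) in the Appendix)

$$\vec{A}_i = \vec{A}_i(q^1, \ldots, q^M) = \mathsf{M}^{-1} \cdot \sum_\alpha \Lambda_\alpha^{-2}\, \vec{c}_\alpha \times \frac{\partial \vec{c}_\alpha}{\partial q^i}, \tag{4}$$

which is dimensionless by definition (assuming that the shape coordinates are dimensionless),
and the metric tensor $g_{ij}$ on the shape manifold (Appendix Eq. (25)) that can be
written in a dimensionless form as

$$g_{ij} = g_{ij}(q^1, \ldots, q^M) = \sum_\alpha \Lambda_\alpha^{-2}\, \frac{\partial \vec{c}_\alpha}{\partial q^i} \cdot \frac{\partial \vec{c}_\alpha}{\partial q^j} - \vec{A}_i \cdot \mathsf{M} \cdot \vec{A}_j. \tag{5}$$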

As discussed in the Appendix, infinitesimal distances $ds^2 = g_{ij}\, dq^i dq^j$ correspond
(up to a constant prefactor) to mass-weighted root mean square deviations (RMSD) minimized
w.r.t. rotations. This fact can in principle be used to approximate distances in shape space
without referring to underlying coordinates.
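
As an illustration of Eqs. (3) to (5), the following minimal sketch evaluates the moment of inertia tensor, the gauge potential and the metric for a hypothetical four-atom chain with a single dihedral angle, and checks numerically that the metric is unchanged when the coordinate functions are rotated in a shape-dependent way as in Eq. (2). The geometry (bond length 1.5, bond angle 110°), the equal masses, the finite-difference derivative and all function names are assumptions made for this example only. Plain masses are used instead of the factors $\Lambda_\alpha^{-2}$, which merely rescales the metric by a constant.

```python
import math

MASSES = (12.0, 12.0, 12.0, 12.0)  # hypothetical equal masses

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def coords(phi, gauge=False):
    """Centre-of-mass coordinates c_alpha(phi) of a 4-atom chain with dihedral phi."""
    b, a = 1.5, math.radians(70.0)  # bond length; 180 deg minus the 110 deg bond angle
    pts = [(-b*math.cos(a), b*math.sin(a), 0.0),
           (0.0, 0.0, 0.0),
           (b, 0.0, 0.0),
           (b + b*math.cos(a), b*math.sin(a)*math.cos(phi), b*math.sin(a)*math.sin(phi))]
    mtot = sum(MASSES)
    com = [sum(m*p[k] for m, p in zip(MASSES, pts))/mtot for k in range(3)]
    pts = [tuple(p[k] - com[k] for k in range(3)) for p in pts]
    if gauge:  # arbitrary shape-dependent rotation about the z axis, cf. Eq. (2)
        t = 0.7*math.sin(phi) + 0.3*phi
        ct, st = math.cos(t), math.sin(t)
        pts = [(ct*x - st*y, st*x + ct*y, z) for x, y, z in pts]
    return pts

def inertia(pts):
    """Moment of inertia tensor sum_a m_a (|c_a|^2 I - c_a (x) c_a)."""
    M = [[0.0]*3 for _ in range(3)]
    for m, c in zip(MASSES, pts):
        c2 = dot(c, c)
        for i in range(3):
            for j in range(3):
                M[i][j] += m*((c2 if i == j else 0.0) - c[i]*c[j])
    return M

def solve3(M, b):
    """Solve M x = b for a 3x3 matrix using Cramer's rule."""
    c0, c1, c2 = [tuple(M[i][k] for i in range(3)) for k in range(3)]
    det = dot(c0, cross(c1, c2))
    return (dot(b, cross(c1, c2))/det, dot(c0, cross(b, c2))/det, dot(c0, cross(c1, b))/det)

def metric(phi, gauge=False, h=1e-5):
    """Return (g, naive): the 1x1 metric of Eq. (5) and the uncorrected sum m |dc/dphi|^2."""
    pts = coords(phi, gauge)
    plus, minus = coords(phi + h, gauge), coords(phi - h, gauge)
    dc = [tuple((p[k] - q[k])/(2*h) for k in range(3)) for p, q in zip(plus, minus)]
    M = inertia(pts)
    s = [sum(m*cross(c, d)[k] for m, c, d in zip(MASSES, pts, dc)) for k in range(3)]
    A = solve3(M, s)  # gauge potential, Eq. (4)
    naive = sum(m*dot(d, d) for m, d in zip(MASSES, dc))
    g = naive - dot(A, tuple(dot(row, A) for row in M))
    return g, naive

g_plain, naive_plain = metric(1.0)
g_gauge, naive_gauge = metric(1.0, gauge=True)
print(g_plain, g_gauge)          # agree up to finite-difference error
print(naive_plain, naive_gauge)  # differ: the uncorrected term is gauge-dependent
```

In this toy model only the combination of the derivative term and the gauge-potential term is independent of the chosen coordinate functions; the first term of Eq. (5) alone is not.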

The behavior of the gauge potential $\vec{A}_i$ under the gauge transform (2) is given by

$$\vec{A}_i \to \mathsf{R} \cdot (\vec{A}_i + \vec{\gamma}_i), \tag{6}$$

where $\vec{\gamma}_i$ is defined by the partial derivative of the rotation $\mathsf{R}$ w.r.t.
the shape coordinates, $\partial\mathsf{R}/\partial q^i = \mathsf{R} \cdot \vec{\gamma}_i \times$.
With (2) and (6) it is straightforward to see that the quantity
$\partial\vec{c}_\alpha/\partial q^i - \vec{A}_i \times \vec{c}_\alpha$ is merely rotated by
$\mathsf{R}$, and hence the metric tensor $g_{ij}$ (see Eq. (25) in the Appendix) is
gauge-invariant, i.e. independent of the choice of the rotation functions
$\mathsf{R}(q^1, \ldots, q^M)$. As noted in the introduction it can be shown that closed paths
in shape space are associated with a change in orientation of the molecule. According to
Eq. (22) in the Appendix the (gauge-dependent) infinitesimal rotation vector $d\vec{\phi}_0$
associated with the variation of shape coordinates $dq^i$ is given by
$d\vec{\phi}_0 = -\vec{A}_i\, dq^i$. The net rotation $S$ generated by moving through a closed
path in shape space can be calculated by accumulating these infinitesimal rotations along the
path. However, because of the non-Abelian nature of the rotation group SO(3), i.e. the
fact that rotations do not commute if their rotation axes are different, it is only possible
to express $S$ in terms of a path-ordered product [11]. As a measurable quantity, $S$ is of
course gauge-invariant and also independent of the parametrization of the closed path. It is
possible to derive an explicit expression for $S$ in the case of infinitesimally small loops,
where a generalized version of Stokes' theorem can be applied [11]. A consequence of the
existence of orientational changes associated with closed loops in shape space is the fact
that the shape manifold defined by the metric $g_{ij}$ cannot be embedded in the
$3N$-dimensional coordinate space as a single-valued function, i.e. the shape manifold does
not only exhibit curvature but also torsion [12].

It is shown in the Appendix that the classical canonical partition function $Z$ in shape
space reads

$$Z = 8\pi^2 \int dq^1 \cdots \int dq^M\, (\det g_{ij})^{1/2} (\det \mathsf{M})^{1/2}\, e^{-\beta V(q^1, \ldots, q^M)}, \tag{7}$$

where the prefactor $8\pi^2$ stems from the integration over all possible orientations of the
molecule. Note that no translational degrees of freedom have been taken into account in
deriving $Z$ since the atom coordinates are defined as center of mass coordinates. However,
because of the separability of center of mass movements in the equation of motion,
translational effects can simply be included by an additional prefactor of $Z$. In the case of
orientational constraints the prefactor $8\pi^2$ would have to be modified accordingly.
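The terms $(\det g_{ij})^{1/2}$ and $(\det \mathsf{M})^{1/2}$ can be included in the Boltzmann
factor as effective conformation-dependent energy terms $F_g$ and $F_M$,

$$Z = 8\pi^2 \int dq^1 \cdots \int dq^M\, e^{-\beta \left[ V + F_g + F_M \right]}, \tag{8}$$

where

$$F_g(q^1, \ldots, q^M) = -\frac{k_B T}{2} \log(\det g_{ij}) \tag{9}$$

and

$$F_M(q^1, \ldots, q^M) = -\frac{k_B T}{2} \log(\det \mathsf{M}). \tag{10}$$

Below, we will calculate $F_g$ and $F_M$ for a number of molecules. In the remainder of the
paper it is assumed that the shape coordinates $q^1, \ldots, q^M$ are given by the dihedral
angles $\phi_1, \ldots, \phi_M$, where $M$ is the number of rotatable bonds.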

The effect of the correction factors $(\det g_{ij})^{1/2}$ and $(\det \mathsf{M})^{1/2}$ becomes
visible in the probability distributions $P_n(\phi)$ of the dihedral angles $\phi_n$ in the
absence of atomic interactions. If the correction factors are omitted in Eq. (7), values of the
dihedral angles are uniformly distributed with $P_n(\phi) = (2\pi)^{-1}$, a fact which is
sometimes used to test microreversibility
in Monte Carlo simulations with complicated concerted-rotation move sets jl]. Here, these 
distributions are given by 



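$$P_n(\phi) = \frac{1}{Z'} \int \prod_{i \neq n} d\phi_i\, \left[ \det g_{ij}(\phi_1, \ldots, \phi_{n-1}, \phi, \phi_{n+1}, \ldots, \phi_M) \right]^{1/2} \left[ \det \mathsf{M}(\phi_1, \ldots, \phi_{n-1}, \phi, \phi_{n+1}, \ldots, \phi_M) \right]^{1/2}, \tag{11}$$

where

$$Z' = \int \prod_{i=1}^{M} d\phi_i\, \left[ \det g_{ij}(\phi_1, \ldots, \phi_M) \right]^{1/2} \left[ \det \mathsf{M}(\phi_1, \ldots, \phi_M) \right]^{1/2}.$$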

3 Correction factors for small molecules 

In this section we calculate the conformation-dependent correction factors in the partition
function (7) and the related quantities $F_g$ and $F_M$ at $T = 300$ K for the peptides
Ace-(Ala)$_{1,2,3}$-Nme as well as for the pentapeptide Met-enkephalin with the sequence
Tyr-Gly-Gly-Phe-Met. We only consider dihedral angles at fully rotatable bonds
($\omega$-angles are set to 180°) that do not connect to methyl or NH$_3^+$ groups.
Conformations of the polyalanines are parametrized by 2, 4, and 6 dihedral angles, while 17
dihedral angles have to be taken into account for Met-enkephalin. Potential energies have been
calculated with the Tinker software package [13] using the OPLS all-atom force field [14]
(without electrostatic interaction cutoff and with $\epsilon = 1$) in conjunction with a GB/SA
implicit-solvent term [15].

Fig. 1 shows the terms $(\det g_{ij})^{1/2}$ and $(\det \mathsf{M})^{1/2}$ as functions of the
dihedral angles $\phi$ and $\psi$ for alanine dipeptide (Ace-Ala-Nme). It is seen that both
terms show relative variations of about 25% and 30%, respectively. The corresponding variations
of $F_g$ and $F_M$ are 0.17 kcal/mol and 0.21 kcal/mol. Note that only these relative
variations matter for the calculation of thermodynamic quantities. The distributions of the
effective energies $F_g$ and $F_M$ for Ace-(Ala)$_{2,3}$-Nme and Met-enkephalin are shown in
Fig. 2, 3, and 4, where each point $(F_g, F_M)$ corresponds to a low-energy molecular
conformation. For the polyalanines, conformations have been obtained on an equidistant grid in
dihedral angle space with grid spacing $\Delta\phi = \pi/5$. In total $10^4$ conformations have
been sampled for Ace-(Ala)$_2$-Nme, and $10^6$ conformations for Ace-(Ala)$_3$-Nme. Of these
conformations, Fig. 2 (Fig. 3) shows those 200 (1000) with the lowest potential energies. The
insets in Fig. 2 and 3 give the corresponding cumulative energy distributions.
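
The correspondence between the relative variations of the determinant factors and the energy variations quoted above for alanine dipeptide can be checked with a few lines of arithmetic. The sketch below assumes that "relative variation" means (max - min)/max of the factor, and uses $k_B \approx 1.987 \times 10^{-3}$ kcal/(mol K); the function name is chosen for this example only.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def energy_spread(relative_variation, temperature=300.0):
    """Spread of F = -k_B T log(f) when the factor f = det^(1/2) varies by
    relative_variation = (f_max - f_min) / f_max."""
    return KB*temperature*math.log(1.0/(1.0 - relative_variation))

spread_g = energy_spread(0.25)  # (det g_ij)^(1/2), about 25% variation
spread_M = energy_spread(0.30)  # (det M)^(1/2), about 30% variation
print(round(spread_g, 2), round(spread_M, 2))  # 0.17 0.21
```

Under this reading of the percentages, the values reproduce the 0.17 kcal/mol and 0.21 kcal/mol given in the text.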

Conformations of the larger and more flexible molecule Met-enkephalin cannot be obtained
by explicit enumeration. In this case conformations have been generated by randomly
choosing dihedral angles and subsequently applying an annealing and energy minimization
procedure that uses short stochastic dynamics runs at decreasing temperatures followed by
steepest-descent minimization, thereby mostly avoiding steric clashes and other high-energy
situations. In this way, 2000 conformations have been generated, of which the 200
with the lowest potential energies have been plotted in Fig. 4 along with the cumulative
energy distribution shown in the inset. It is known that Met-enkephalin does not adopt a
single conformation in aqueous solution at room temperature [16], not least because
of its biological function as a neurotransmitter binding to a number of different receptors.
One may therefore expect that many conformations generated by the procedure described
above can actually be assumed by the molecule under biological conditions.

Variations $\Delta F_g$ and $\Delta F_M$ of the energies $F_g$ and $F_M$ in Fig. 2, 3, and 4
(here defined by the difference between maximal and minimal values in the data) strongly depend
on the size of the molecule. While for Ace-(Ala)$_2$-Nme and Ace-(Ala)$_3$-Nme these variations
are small ($\Delta F_g = 0.29$ kcal/mol and $\Delta F_M = 0.21$ kcal/mol in the first case, and
$\Delta F_g = 0.53$ kcal/mol and $\Delta F_M = 0.35$ kcal/mol in the latter one), they become
more significant for Met-enkephalin, where $F_g$ and $F_M$ vary by $\Delta F_g = 2.45$ kcal/mol
and $\Delta F_M = 0.77$ kcal/mol according to the data plotted in Fig. 4. Fig. 5 shows those
conformations of Met-enkephalin that correspond to the numbered data points in Fig. 4 where
$F_g$ and $F_M$, and therefore $(\det g_{ij})^{1/2}$ and $(\det \mathsf{M})^{1/2}$, assume
extreme values. Clearly, conformations with a small factor $(\det \mathsf{M})^{1/2}$ exhibit a
small radius of gyration and are therefore more compact than conformations with large values
of this term. The interpretation of the term $(\det g_{ij})^{1/2}$ is more difficult and its
values depend on the details of the molecule. However, consistent with the pictures shown in
Fig. 5 it can be said that more "rigid", stretched conformations lead to a smaller factor
$(\det g_{ij})^{1/2}$ than "sloppier", curved conformations, where small changes in the
dihedral angles result in larger changes in the atom coordinates. Since conformations with
large radius of gyration are usually more stretched than those with small radius of gyration,
this may explain the weak negative correlation between $F_g$ and $F_M$ indicated by the
straight lines (obtained by a least squares fit) in Fig. 2, 3, and 4 (correlation coefficients
$-0.5427$, $-0.5405$, and $-0.1511$). We do not find correlations between the potential energy
$E$ and $F_g$ or $F_M$ except for the case of Met-enkephalin, where $E$ is negatively
correlated with $F_M$ because more compact conformations forming hydrogen bonds or salt bridges
are found to be energetically favored by several kcal/mol.

The probability distributions $P_n(\phi)$ of dihedral angles in the absence of atomic
interactions defined in Eq. (11) can be calculated by first obtaining a series of conformations
where all values of dihedral angles are sampled with equal probability. The histograms
$H_n(k)$ approximating $P_n(\phi)$ are then given by

$$H_n(k) = \frac{\left\langle \delta_k(\phi_n)\, (\det g_{ij})^{1/2} (\det \mathsf{M})^{1/2} \right\rangle}{\left\langle (\det g_{ij})^{1/2} (\det \mathsf{M})^{1/2} \right\rangle},$$

where $\langle \cdot \rangle$ denotes an average over the series of conformations,
$k = 0, \ldots, K - 1$ ($K$ being the number of bins), and

$$\delta_k(\phi) = \begin{cases} 0 & \text{for } \phi < 2\pi k/K, \\ 1 & \text{for } 2\pi k/K \le \phi < 2\pi (k+1)/K, \\ 0 & \text{for } \phi \ge 2\pi (k+1)/K. \end{cases}$$

It is $K H_n(k) \approx 2\pi P_n(\phi)$ for $\phi \in [2\pi k/K, 2\pi (k+1)/K)$. Fig. 6 shows
the so-obtained distributions for all 17 dihedral angles of Met-enkephalin using 10^ randomly
sampled conformations. It is seen that deviations from the uniform distribution
$P_n(\phi) = (2\pi)^{-1}$ of up to 40% occur for some of the dihedral angles (most prominently
for $\phi_3$, $\phi_{11}$, and $\phi_{16}$, where the indices refer to the bond indices shown
in Fig. 5). This again demonstrates that the correction factors $(\det g_{ij})^{1/2}$ and
$(\det \mathsf{M})^{1/2}$ are significant for Met-enkephalin.
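
The reweighting procedure described above can be sketched as follows. Since evaluating the true determinants requires a molecular model, the example uses a hypothetical one-angle weight function `toy_weight` as a stand-in for $(\det g_{ij})^{1/2} (\det \mathsf{M})^{1/2}$, and a regular grid of angles in $[0, 2\pi)$ in place of random uniform sampling so that the result is deterministic. Both choices, and all names, are assumptions for illustration.

```python
import math

def reweighted_histogram(angles, weights, n_bins):
    """H(k) = <delta_k(phi) w> / <w> for uniformly sampled angles in [0, 2*pi)."""
    hist = [0.0]*n_bins
    for phi, w in zip(angles, weights):
        k = int(n_bins*(phi % (2.0*math.pi))/(2.0*math.pi))
        hist[min(k, n_bins - 1)] += w  # guard against rounding at the upper edge
    total = sum(weights)
    return [h/total for h in hist]

def toy_weight(phi):
    """Hypothetical stand-in for (det g_ij)^(1/2) (det M)^(1/2)."""
    return 1.0 + 0.4*math.cos(phi)

n_samples, n_bins = 3600, 36
angles = [(i + 0.5)*2.0*math.pi/n_samples for i in range(n_samples)]  # uniform grid
weights = [toy_weight(phi) for phi in angles]
hist = reweighted_histogram(angles, weights, n_bins)

# K * H(k) approximates 2*pi * P(phi) on bin k; 1.0 corresponds to the uniform case.
relative_density = [n_bins*h for h in hist]
print(relative_density[0], relative_density[n_bins//2])
```

For this toy weight the first bin (near $\phi = 0$) is enhanced by roughly 40% and the bin near $\phi = \pi$ is depleted by a similar amount relative to the uniform distribution.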



4 Protein folding 

In this section we discuss the terms $F_g$ and $F_M$ in the context of folded and unfolded
states of a protein with radii of gyration $R^{\text{folded}}$ and $R^{\text{unfolded}}$.
First, an estimate of $F_g$ is given considering only backbone dihedral angles ($\phi$- and
$\psi$-angles) with the assumption that the contributions of side chain dihedrals to the metric
tensor $g_{ij}$ are roughly the same for folded and unfolded states. In order to calculate the
determinant of $g_{ij}$ we have to estimate its eigenvalues $\lambda$. One may argue that the
corresponding eigenvectors are localized on the peptide chain and span about seven adjacent
torsion angles, having in mind that six is the number of angles needed to solve the so-called
rebridging problem for polymer chains [17]. This means that there is one free parameter in the
concerted motion of seven adjacent $\phi/\psi$-angles leaving the rest of the protein
unchanged. We therefore assume that an eigenvector of $g_{ij}$ corresponds to a loop of about
seven residues with length $\ell$ and with endpoints $A$ and $B$. Let $r$ be the average
distance of $A$ and $B$, and $\xi$ the length scale of the average lateral fluctuations of such
a loop w.r.t. the axis $AB$. Since this is a short loop (w.r.t. the persistence length scale)
we estimate the scaling behavior of $\xi$ to be $\xi \sim \sqrt{\ell^2 - r^2}$. The
corresponding eigenvalue of $g_{ij}$ should scale as $\lambda \sim m\xi^2$, where $m$ is the
mass of the loop, and therefore $\det g_{ij} \sim (m\xi^2)^{2N}$, where $N$ is the number of
residues. Assuming $r \ll \ell$ and taking $r$ to be proportional to the radius of gyration we
obtain

$$\Delta F_g = F_g^{\text{unfolded}} - F_g^{\text{folded}} \approx N k_B T \left( \frac{r^{\text{folded}}}{\ell} \right)^2 \left[ \left( \frac{R^{\text{unfolded}}}{R^{\text{folded}}} \right)^2 - 1 \right].$$

Using a value of 1.6 for the ratio of the radii of gyration in the unfolded and folded states
from [18], and very roughly estimating $r^{\text{folded}}/\ell \approx 0.4$ from the analysis
of C$_\alpha$ distances of protein structures, we find $\Delta F_g \approx 7.5$ kcal/mol at
$T = 300$ K for a protein with $N = 50$ residues. One may therefore conclude that energy
corrections due to the metric of the dihedral angle shape manifold are a significant energetic
contribution and should be taken into account in folding simulations using Monte Carlo moves
based on dihedral angles.
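
The numerical estimate can be reproduced from the scaling argument above. The sketch assumes the small-$r/\ell$ form $\Delta F_g \approx N k_B T\, (r^{\text{folded}}/\ell)^2\, [(R^{\text{unfolded}}/R^{\text{folded}})^2 - 1]$, which follows from $F_g = -(k_B T/2) \log \det g_{ij}$ with $\det g_{ij} \sim (m\xi^2)^{2N}$ and $\xi^2 \sim \ell^2 - r^2$, together with $k_B \approx 1.987 \times 10^{-3}$ kcal/(mol K). The function and parameter names are illustrative.

```python
KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def delta_F_g(n_residues, r_folded_over_l, rg_ratio, temperature=300.0):
    """F_g(unfolded) - F_g(folded) in kcal/mol, small-r/l approximation.

    rg_ratio is the ratio of the radii of gyration, unfolded over folded.
    """
    return n_residues*KB*temperature*r_folded_over_l**2*(rg_ratio**2 - 1.0)

estimate = delta_F_g(50, 0.4, 1.6)
print(round(estimate, 2))
```

With these inputs the result is about 7.4 kcal/mol, consistent with the rounded value of roughly 7.5 kcal/mol quoted in the text (which corresponds to $k_B T \approx 0.6$ kcal/mol). The estimate grows linearly with the number of residues.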

We now turn to the term $\det \mathsf{M}$. Since this term arises from the thermal equilibration
of angular momentum, its contribution is for instance not included in molecular dynamics
simulations that enforce $\vec{L} = 0$ for the protein. In order to estimate $F_M$ we have to
consider inertial effects of the solvent as well. While these effects should be negligible for
small molecules, it is reasonable to assume that in the unfolded state of a globular protein
many water molecules are effectively trapped and therefore rigidly coupled. For our estimate we
therefore consider two limiting cases: complete viscous coupling of the solvent (i.e.
neglecting inertial effects of the solvent altogether) and complete rigid coupling (i.e. all
solvent molecules in the sphere defined by the radius of gyration $R$ are rigidly coupled to
the protein). In the first case the components $M$ of the diagonalized moment of inertia tensor
$\mathsf{M}$ scale as $M \sim mR^2$, where $m = \text{const.}$ is the mass of the protein,
while in the latter case the mass $m$ itself scales as $m \sim R^3$, so that $M \sim R^5$
assuming a uniform mass density of protein and solvent. With $\det \mathsf{M} \sim (mR^2)^3$ we
derive

$$\Delta F_M = F_M^{\text{unfolded}} - F_M^{\text{folded}} \approx -\kappa\, k_B T \log \left( \frac{R^{\text{unfolded}}}{R^{\text{folded}}} \right),$$

where $\kappa = 3$ in the case of viscously coupled solvent and $\kappa = 15/2$ for rigid
coupling. At $T = 300$ K we get $\Delta F_M \approx -0.85$ kcal/mol in the first case and
$\Delta F_M \approx -2.1$ kcal/mol in the second.
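
The two limiting values follow directly from $F_M = -(k_B T/2) \log \det \mathsf{M}$ with $\det \mathsf{M} \sim R^6$ (viscous coupling) or $\det \mathsf{M} \sim R^{15}$ (rigid coupling). The short sketch below evaluates them for the ratio of radii of gyration of 1.6 used above; the value of $k_B$ and the function name are assumptions of this example.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def delta_F_M(kappa, rg_ratio, temperature=300.0):
    """F_M(unfolded) - F_M(folded) in kcal/mol for det M ~ R^(2*kappa)."""
    return -kappa*KB*temperature*math.log(rg_ratio)

viscous = delta_F_M(3.0, 1.6)   # solvent inertia neglected
rigid = delta_F_M(7.5, 1.6)     # solvent rigidly coupled inside the radius of gyration
print(round(viscous, 2), round(rigid, 2))
```

The results, about -0.84 and -2.10 kcal/mol, agree with the rounded values given in the text.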

Note that $\Delta F_g$ and $\Delta F_M$ have different signs: unfolded or stretched
conformations have a smaller value of $\det g_{ij}$ because the loops are stiffer and therefore
dihedral angle variations result in smaller variations in Cartesian space, while the moment of
inertia increases with the characteristic length scale of the molecule. This confirms the trend
that had already been observed for the small molecules discussed in the previous section. It is
seen that $|\Delta F_g|$ is larger than $|\Delta F_M|$ by about a factor of 5 for a protein
with 50 residues. While $\Delta F_g$ depends linearly on the chain length, $\Delta F_M$ is
independent of the size of the protein.

5 Discussion and conclusion



It has been shown in the present paper that statistical mechanical sampling of molecular
conformations has to account for the correct metric $g_{ij}$ in conformational space as well as
the conformation-dependence of the moment of inertia tensor $\mathsf{M}$, both of which can be
expressed in terms of effective conformation-dependent energy contributions, $F_g$ and $F_M$.
Using dihedral angles as internal coordinates, the distribution of these energy contributions
for low-energy conformations has been calculated numerically for a number of short peptides.
While their influence is small for molecules with few rotatable bonds, we find variations of
about 2.45 kcal/mol (for $F_g$) and 0.77 kcal/mol (for $F_M$) for the pentapeptide
Met-enkephalin. A rough estimate of both terms in the folded and unfolded states of a protein
with 50 residues leads to a significant energy difference of 7.5 kcal/mol for $F_g$ and between
-0.85 kcal/mol and -2.1 kcal/mol for $F_M$. This shows that both correction terms should
be taken into account when dihedral coordinates are used in thermodynamic Monte Carlo
simulations of protein folding in order to accurately calculate thermodynamic quantities.
Since $F_g + F_M$ is larger in the unfolded state of a protein, Monte Carlo simulations that
omit these corrections lead to free energy differences between unfolded and folded states that
are too small. These simulations therefore underestimate the stability of the folded state.
The efficient implementation of metric and moment of inertia-related correction terms within
a Monte Carlo algorithm will be the subject of a future publication.

A related problem, where the consideration of the proper metric in conformational space 
is important, is the estimate of thermodynamic quantities from a given, finite ensemble of 
molecular conformations. Such ensembles can be generated for molecules with not too many 
rotatable bonds and allow for the calculation of free energies and conformational entropies 
which are otherwise difficult to access in importance sampling-based methods. An example 
is the calculation of protein side chain free energies ^] from rotamer libraries [21]. In 
principle it is possible to estimate thermodynamic properties such as entropies from any given 
ensemble of conformations without referring to underlying internal coordinates. Distances 
according to the metric $g_{ij}$ can be approximated by the mass-weighted RMSD between two
conformations minimized w.r.t. rotations and translations. However, the calculation of the 
partition function would then require a triangulation procedure in order to calculate volume 
elements in conformational space.

Appendix



In the following we will derive the classical canonical partition function $Z$ in shape space.
At first, an expression for the Hamiltonian $\mathcal{H}$ is obtained which separates shape
space and rotational contributions. A more detailed derivation of $\mathcal{H}$ and a
discussion of the related theory can be found in the review article [11]. In order to keep
equations as simple as possible three different notations are used to distinguish between
vectors in three-dimensional Cartesian space, sums over atom positions $\vec{c}_\alpha$,
$\alpha = 1, \ldots, N$, and sums over shape coordinates $q^i$, $i = 1, \ldots, M$.
Three-dimensional vectors are given in a vector notation with dots ($\cdot$) and crosses
($\times$) for scalar and vector products. The Einstein sum convention is employed for
summation over latin indices involving the shape coordinates $q^i$. Sums over atom positions
(greek indices) are written explicitly. We also define a $3N$-dimensional vector (denoted
in bold face), $\mathbf{c} = (\vec{c}_1, \ldots, \vec{c}_N)$, that contains the
(three-dimensional) atom positions as components. Vector operations, $\cdot$ and $\times$,
acting on $3N$-dimensional vectors are meant to act on each vector component independently.
Finally, $\langle\, \cdot \,|\, \cdot \,\rangle$ is a scalar product in the $3N$-dimensional
vector space defined by

$$\langle \mathbf{u} | \mathbf{v} \rangle = \sum_\alpha m_\alpha\, \vec{u}_\alpha \cdot \vec{v}_\alpha, \tag{12}$$

where the associated norm is $\|\mathbf{u}\|^2 = \langle \mathbf{u} | \mathbf{u} \rangle$.

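Let a molecular conformation be given by a vector $\mathbf{c}$ and consider a (kinematic)
variation of shape coordinates $\{dq^i\}$ and the resulting variation $d\mathbf{c}$ in
Cartesian space,

$$d\mathbf{c} = \frac{\partial \mathbf{c}}{\partial q^i}\, dq^i. \tag{13}$$

The goal is to separate $d\mathbf{c}$ into a pure shape variation part $d\mathbf{c}_\parallel$
and a pure rotational part $d\mathbf{c}_\perp$,

$$d\mathbf{c} = d\mathbf{c}_\parallel + d\mathbf{c}_\perp, \tag{14}$$

such that both parts are orthogonal to each other,

$$\langle d\mathbf{c}_\parallel | d\mathbf{c}_\perp \rangle = 0. \tag{15}$$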

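This can be achieved by defining an infinitesimal rotation
$\mathsf{R} = \mathsf{I} + d\vec{\phi}\, \times$ that acts on $\mathbf{c}$, where $d\vec{\phi}$
is an angular rotation vector. Then,

$$\mathsf{R}\mathbf{c} - \mathbf{c} = d\vec{\phi} \times \mathbf{c}$$

defines a three-dimensional linear subspace $V$ which is parametrized by $d\vec{\phi}$, and
$d\mathbf{c}_\perp$ is the orthogonal projection of $d\mathbf{c}$ on $V$. The vector
$d\mathbf{c}_\perp$ can be calculated by minimizing the infinitesimal distance

$$D(d\vec{\phi}) = \| d\mathbf{c} + d\vec{\phi} \times \mathbf{c} \|, \tag{16}$$

where the minimum $D_0$ is defined by

$$D_0 = D(d\vec{\phi}_0) = \min_{d\vec{\phi}} \{ D(d\vec{\phi}) \}. \tag{17}$$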

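This leads to the result

$$d\vec{\phi}_0 = -\mathsf{M}^{-1} \cdot \sum_\alpha m_\alpha\, \vec{c}_\alpha \times d\vec{c}_\alpha, \tag{18}$$

where

$$\mathsf{M} = \sum_\alpha m_\alpha \left( |\vec{c}_\alpha|^2\, \mathsf{I} - \vec{c}_\alpha \otimes \vec{c}_\alpha \right) \tag{19}$$

is the moment of inertia tensor, and we have

$$d\mathbf{c}_\perp = -d\vec{\phi}_0 \times \mathbf{c} \tag{20}$$

and

$$d\mathbf{c}_\parallel = d\mathbf{c} + d\vec{\phi}_0 \times \mathbf{c}. \tag{21}$$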
The rotation vector $d\vec{\phi}_0$ can be expressed in terms of a gauge potential $\vec{A}_i$,

$$d\vec{\phi}_0 = -\vec{A}_i\, dq^i, \tag{22}$$

where $\vec{A}_i$ is defined as

$$\vec{A}_i = \mathsf{M}^{-1} \cdot \sum_\alpha m_\alpha\, \vec{c}_\alpha \times \frac{\partial \vec{c}_\alpha}{\partial q^i}. \tag{23}$$

Note that the variation $d\mathbf{c}_\parallel$ is independent of the freedom in the choice of
the coordinate functions $\vec{c}_\alpha(q^1, \ldots, q^M)$ described in Eq. (2) because it has
been obtained by minimizing the distance $D$, while $d\mathbf{c}_\perp$ is non-unique and
depends on this choice. According to Eqs. (16) and (17) the infinitesimal distance $D_0$
corresponds to the mass-weighted RMSD between $\mathbf{c} + d\mathbf{c}$ and $\mathbf{c}$
minimized w.r.t. rotations.
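
The projection defined by Eqs. (16) to (21) can be checked numerically. The following sketch uses hypothetical random coordinates and displacements (fixed seed) for five atoms with made-up masses, computes $d\vec{\phi}_0$ from the moment of inertia tensor as in Eq. (18), and verifies that the resulting shape and rotational parts are orthogonal with respect to the scalar product (12). All names and numerical values are assumptions for illustration.

```python
import random

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def solve3(M, b):
    """Solve M x = b for a 3x3 matrix using Cramer's rule."""
    c0, c1, c2 = [tuple(M[i][k] for i in range(3)) for k in range(3)]
    det = dot(c0, cross(c1, c2))
    return (dot(b, cross(c1, c2))/det, dot(c0, cross(b, c2))/det, dot(c0, cross(c1, b))/det)

random.seed(1)
masses = [12.0, 14.0, 16.0, 1.0, 1.0]  # hypothetical atoms

def centred(vectors):
    """Subtract the mass-weighted mean so that sum_a m_a v_a = 0."""
    mtot = sum(masses)
    mean = [sum(m*v[k] for m, v in zip(masses, vectors))/mtot for k in range(3)]
    return [tuple(v[k] - mean[k] for k in range(3)) for v in vectors]

c = centred([tuple(random.uniform(-2, 2) for _ in range(3)) for _ in masses])
dc = centred([tuple(random.uniform(-0.1, 0.1) for _ in range(3)) for _ in masses])

def scalar(u, v):
    """Mass-weighted scalar product, Eq. (12)."""
    return sum(m*dot(a, b) for m, a, b in zip(masses, u, v))

def rotated(dphi):
    """dphi x c, applied atom by atom."""
    return [cross(dphi, a) for a in c]

def distance2(dphi):
    """Squared distance ||dc + dphi x c||^2, cf. Eq. (16)."""
    r = [tuple(x + y for x, y in zip(d, w)) for d, w in zip(dc, rotated(dphi))]
    return scalar(r, r)

# Moment of inertia tensor, Eq. (19), and minimizing rotation vector, Eq. (18).
M = [[sum(m*((dot(a, a) if i == j else 0.0) - a[i]*a[j]) for m, a in zip(masses, c))
      for j in range(3)] for i in range(3)]
s = [sum(m*cross(a, d)[k] for m, a, d in zip(masses, c, dc)) for k in range(3)]
dphi0 = tuple(-x for x in solve3(M, s))

dc_perp = [tuple(-x for x in w) for w in rotated(dphi0)]                    # Eq. (20)
dc_par = [tuple(x - y for x, y in zip(d, p)) for d, p in zip(dc, dc_perp)]  # Eq. (21)
print(scalar(dc_par, dc_perp))  # vanishes up to rounding, Eq. (15)
```

Perturbing `dphi0` in any direction increases `distance2`, which is the minimum property stated in Eq. (17).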

Let us now use a body frame of reference and assume that the Cartesian coordinates of
the system are momentarily described by the vector $\mathbf{c}$. Consider a variation of
Cartesian coordinates $d\mathbf{r}$ which is the sum of a shape contribution $d\mathbf{c}$ and
an external infinitesimal rotation given by $d\vec{\phi} \times \mathbf{c}$,

$$d\mathbf{r} = d\mathbf{c} + d\vec{\phi} \times \mathbf{c}.$$

Using Eqs. (20) and (21) we can write

$$d\mathbf{r} = d\mathbf{c}_\parallel + (d\vec{\phi} - d\vec{\phi}_0) \times \mathbf{c}. \tag{24}$$

Because the rotational part of $d\mathbf{r}$, $(d\vec{\phi} - d\vec{\phi}_0) \times \mathbf{c}$,
is an element of the linear space $V$, it is also orthogonal to $d\mathbf{c}_\parallel$. This
can be independently verified using the definition of the scalar product (12).

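The kinetic energy of the system is given by

$$T = \frac{1}{2} \sum_\alpha m_\alpha \left| \frac{d\vec{r}_\alpha}{dt} \right|^2,$$

which can be expressed in terms of the scalar product (12) as

$$T = \frac{1}{2} \left\langle \frac{d\mathbf{r}}{dt} \,\Big|\, \frac{d\mathbf{r}}{dt} \right\rangle.$$

The orthogonality between the shape and rotational part of $d\mathbf{r}$ described above now
allows us to separate both contributions in the expression for $T$,

$$T = \frac{1}{2} \left\langle \frac{\partial \mathbf{c}}{\partial q^i} - \vec{A}_i \times \mathbf{c} \,\Big|\, \frac{\partial \mathbf{c}}{\partial q^j} - \vec{A}_j \times \mathbf{c} \right\rangle \dot{q}^i \dot{q}^j + \frac{1}{2} \left\langle (\vec{\omega} + \vec{A}_i \dot{q}^i) \times \mathbf{c} \,\Big|\, (\vec{\omega} + \vec{A}_j \dot{q}^j) \times \mathbf{c} \right\rangle,$$

where the angular velocity vector is defined by $\vec{\omega} = d\vec{\phi}/dt$ and we inserted
the expressions (20), (21), and (22). With the definition of the metric tensor

$$g_{ij} = \left\langle \frac{\partial \mathbf{c}}{\partial q^i} - \vec{A}_i \times \mathbf{c} \,\Big|\, \frac{\partial \mathbf{c}}{\partial q^j} - \vec{A}_j \times \mathbf{c} \right\rangle \tag{25}$$

the kinetic energy $T$ can finally be written as

$$T = \frac{1}{2} g_{ij}\, \dot{q}^i \dot{q}^j + \frac{1}{2} (\vec{\omega} + \vec{A}_i \dot{q}^i) \cdot \mathsf{M} \cdot (\vec{\omega} + \vec{A}_j \dot{q}^j). \tag{26}$$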

We now assume that the system is only subject to forces between the atoms, which means
that the potential energy $V = V(q^1, \ldots, q^M)$ is a function of the shape coordinates
alone and the Lagrangian can be written as

$$\mathcal{L} = T - V(q^1, \ldots, q^M). \tag{27}$$

From Eq. (26) the angular momentum $\vec{L}$ is obtained by

$$\vec{L} = \frac{\partial T}{\partial \vec{\omega}} = \mathsf{M} \cdot (\vec{\omega} + \vec{A}_i \dot{q}^i), \tag{28}$$

and the generalized momenta $p_i$ associated with the shape coordinates are

$$p_i = \frac{\partial T}{\partial \dot{q}^i} = g_{ij}\, \dot{q}^j + \vec{L} \cdot \vec{A}_i. \tag{29}$$

In the case of zero angular momentum, i.e. the absence of any external torque, the rotation
generated by an internal motion is therefore given by $\vec{\omega} = -\vec{A}_i \dot{q}^i$.
Comparing this with Eq. (22) shows that the shape manifold itself is actually defined by the
condition $\vec{L} = 0$.

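Now, the Hamiltonian $\mathcal{H} = T + V$ can be obtained by solving (28) and (29) for the
angular and shape coordinate velocities and inserting them into the kinetic energy,

$$\mathcal{H} = \frac{1}{2} g^{ij} (p_i - \vec{L} \cdot \vec{A}_i)(p_j - \vec{L} \cdot \vec{A}_j) + \frac{1}{2} \vec{L} \cdot \mathsf{M}^{-1} \cdot \vec{L} + V(q^1, \ldots, q^M). \tag{30}$$

Here, $g^{ij}$ is the inverse metric tensor, and $q^i$ and $p_i$ are canonically conjugated
variables. It is not possible to define coordinates which are canonically conjugated to the
angular momentum $\vec{L}$ [11], because the angular velocity $\vec{\omega}$ cannot be written
as a simple time derivative of angular coordinates. Nevertheless, in order to derive the
partition function $Z$ from $\mathcal{H}$, it is necessary to describe the rotational part of
the Hamiltonian $\mathcal{H}$ with canonical variables as well. This can be achieved by using
Euler angles $\theta$, $\varphi$, $\psi$ and their canonically conjugated momenta $p_\theta$,
$p_\varphi$, $p_\psi$. Here, it is convenient to use the principal axes of the moment of
inertia tensor $\mathsf{M}$ as the basis of the underlying coordinate system (where
$\mathsf{M}$ is diagonal). With

$$\vec{\omega} = \begin{pmatrix} \dot{\theta} \cos\psi + \dot{\varphi} \sin\psi \sin\theta \\ \dot{\theta} \sin\psi - \dot{\varphi} \cos\psi \sin\theta \\ \dot{\psi} + \dot{\varphi} \cos\theta \end{pmatrix}$$

we have

$$\vec{L} = \frac{1}{\sin\theta} \begin{pmatrix} \cos\psi \sin\theta\, p_\theta + \sin\psi\, (p_\varphi - \cos\theta\, p_\psi) \\ \sin\psi \sin\theta\, p_\theta - \cos\psi\, (p_\varphi - \cos\theta\, p_\psi) \\ \sin\theta\, p_\psi \end{pmatrix}$$

and the rotational part of the kinetic energy, $\frac{1}{2} \vec{L} \cdot \mathsf{M}^{-1} \cdot \vec{L}$, becomes

$$\frac{1}{2} \left\{ \frac{\left[ (p_\varphi - p_\psi \cos\theta) \sin\psi + p_\theta \sin\theta \cos\psi \right]^2}{M_1 \sin^2\theta} + \frac{\left[ (p_\varphi - p_\psi \cos\theta) \cos\psi - p_\theta \sin\theta \sin\psi \right]^2}{M_2 \sin^2\theta} + \frac{p_\psi^2}{M_3} \right\},$$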

where $M_1$, $M_2$, and $M_3$ are the diagonal components of the moment of inertia tensor. The
classical canonical partition function $Z$ of the system is then given by (no sum convention!)

$$Z = \left( \frac{1}{2\pi\hbar} \right)^{M+3} \int \prod_{i=1}^{M} dq^i\, dp_i \int dp_\theta\, dp_\varphi\, dp_\psi\, d\theta\, d\varphi\, d\psi\; e^{-\beta \mathcal{H}(q^1, \ldots, q^M,\, p_1, \ldots, p_M,\, \theta, \varphi, \psi,\, p_\theta, p_\varphi, p_\psi)},$$

where $\beta$ is the inverse temperature. The factor $\left( \frac{1}{2\pi\hbar} \right)^{M+3}$
comes from the standard "coarse-graining" procedure in phase space, which is a way to derive
the correct quantum-mechanical prefactor of classical partition functions [1^]. In this
procedure, as a consequence of the Heisenberg uncertainty relation, the phase space is divided
into cells with volume $\Delta p\, \Delta q = 2\pi\hbar$, where $q$ and $p$ are arbitrary pairs
of canonically conjugated generalized coordinates and momenta.

In order to obtain the partition function in the shape coordinates alone we first integrate
over the generalized momenta $p_i$, followed by an integration over the Euler momenta
$p_\theta$, $p_\varphi$, $p_\psi$. These integrals are purely Gaussian and can readily be
performed analytically. Finally, the integration over all orientations of the system, i.e. the
Euler angles themselves, is trivial and results in a numerical factor $8\pi^2$. This leads to
the final result

$$Z = 8\pi^2 \left( \frac{1}{2\pi\hbar^2\beta} \right)^{(M+3)/2} \int dq^1 \cdots \int dq^M\, (\det g_{ij})^{1/2} (\det \mathsf{M})^{1/2}\, e^{-\beta V(q^1, \ldots, q^M)}.$$

It is seen that the factor $(\det g_{ij})^{1/2}$, which depends on the shape coordinates,
results from the equilibration of the momenta in shape space, while the factor
$(\det \mathsf{M})^{1/2}$ originates from the equilibration of angular momentum. Upon absorbing
the factor $\left( \frac{1}{2\pi\hbar^2\beta} \right)^{(M+3)/2}$ in the determinants we arrive
at the expression for $Z$ given in Eq. (7).
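
The origin of the orientational factor $8\pi^2$ can be made explicit with a short numerical check. Changing variables from the Euler momenta to the components of $\vec{L}$ (using the expression for $\vec{L}$ given above) produces a Jacobian of magnitude $1/\sin\theta$, so the Gaussian integral over $\vec{L}$ leaves a factor $\sin\theta$ in the integral over the Euler angles. The sketch below assumes the usual ranges $\theta \in [0, \pi]$ and $\varphi, \psi \in [0, 2\pi)$, which are not stated explicitly in the text; the function names are illustrative.

```python
import math

def angular_momentum(p_theta, p_phi, p_psi, theta, psi):
    """Body-frame angular momentum from the Euler momenta, following the text."""
    a = (p_phi - math.cos(theta)*p_psi)/math.sin(theta)
    return (math.cos(psi)*p_theta + math.sin(psi)*a,
            math.sin(psi)*p_theta - math.cos(psi)*a,
            p_psi)

def jacobian(theta, psi):
    """det dL/dp; L is linear in the momenta, so unit momenta give the columns."""
    c0 = angular_momentum(1.0, 0.0, 0.0, theta, psi)
    c1 = angular_momentum(0.0, 1.0, 0.0, theta, psi)
    c2 = angular_momentum(0.0, 0.0, 1.0, theta, psi)
    cx = (c1[1]*c2[2] - c1[2]*c2[1], c1[2]*c2[0] - c1[0]*c2[2], c1[0]*c2[1] - c1[1]*c2[0])
    return c0[0]*cx[0] + c0[1]*cx[1] + c0[2]*cx[2]

def orientation_volume(n=2000):
    """Midpoint-rule integral of sin(theta) over theta, times (2*pi)^2 for phi and psi."""
    h = math.pi/n
    theta_integral = sum(math.sin((i + 0.5)*h) for i in range(n))*h
    return theta_integral*(2.0*math.pi)**2

print(abs(jacobian(0.7, 1.3))*math.sin(0.7))  # close to 1
print(orientation_volume(), 8.0*math.pi**2)   # both close to 78.96
```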
Fig. 1. The terms $(\det g_{ij})^{1/2}$ (top) and $(\det \mathsf{M})^{1/2}$ (bottom) as
functions of the dihedral angles $\phi$ and $\psi$ for alanine dipeptide (Ace-Ala-Nme).

Fig. 2. Effective energies $F_g$ and $F_M$ for low-energy conformations of Ace-(Ala)$_2$-Nme.
The inset shows the cumulative distribution of potential energies of the sampled conformations.

Fig. 3. Effective energies $F_g$ and $F_M$ for low-energy conformations of Ace-(Ala)$_3$-Nme.
The inset shows the cumulative distribution of potential energies of the sampled conformations.

Fig. 4. Effective energies $F_g$ and $F_M$ for low-energy conformations of Met-enkephalin. The
inset shows the cumulative distribution of potential energies of the sampled conformations.

Fig. 5. Conformations of Met-enkephalin corresponding to the numbered data points in Fig. 4 for
which either $F_g$ or $F_M$ take maximal or minimal values.

Fig. 6. Probability distributions of dihedral angles $\phi_i$ for Met-enkephalin in the absence
of atomic interactions. The indices $i$ correspond to the bond indices given in Fig. 5.